Caterpillars, context, tree automata and tree pattern matching
Authors
Abstract
We present a novel, yet simple, technique for the specification of context in structured documents that we call caterpillar expressions. Although we are primarily applying this technique in the specification of context-dependent style sheets for HTML, SGML and XML documents, it can also be used for query specification for structured documents, as we shall demonstrate, and for the specification of computer program transformations. From a conceptual point of view, structured documents are trees, and one of the oldest and best-established techniques for processing trees and, hence, structured documents is the tree automaton. We present a number of theoretical results that allow us to compare the expressive power of caterpillar expressions and caterpillar automata, their companions, to the expressive power of tree automata. In particular, we demonstrate that each caterpillar expression describes a regular tree language that is, hence, recognizable by a tree automaton. Finally, we employ caterpillar expressions for tree pattern matching. We demonstrate that caterpillar automata are able to solve tree-pattern-matching problems for some, but not all, types of tree inclusion that Kilpeläinen investigated in his PhD thesis. In simulating tree pattern matching with caterpillar automata, we reprove some of Kilpeläinen's results in a uniform framework.
Related resources
Caterpillars: A Context Specification Technique
We present a novel, yet simple, technique for the specification of context in structured documents that we call caterpillar expressions. Although we are primarily applying this technique in the specification of context-dependent style sheets for HTML, SGML and XML documents, it can also be used for query specification for structured documents, as we shall demonstrate, and for the specification of ...
Multidimensional fuzzy finite tree automata
This paper introduces the notion of multidimensional fuzzy finite tree automata (MFFTA) and investigates its closure properties from the area of automata and language theory. MFFTA are a superclass of fuzzy tree automata whose behavior is generalized to adapt to multidimensional fuzzy sets. An MFFTA recognizes a multidimensional fuzzy tree language which is a regular tree language so that for e...
An Automata-Based Approach to Pattern Matching
Due to its importance in security, syntax analysis has found usage in many high-level programming languages. The Lisp language has its share of operations for evaluating regular expressions, but native parsing of Lisp code in this way is unsupported. Matching on lists requires a significantly more complicated model, with a different programmatic approach than that of string matching. This work ...
A missing link in root-to-frontier tree pattern matching
Tree pattern matching (tpm) algorithms play an important role in practical applications such as compilers and XML document validation. Many tpm algorithms based on tree automata have appeared in the literature. For reasons of efficiency, these automata are preferably deterministic. Deterministic root-to-frontier tree automata (drftas) are less powerful than nondeterministic ones, and no root-to...
Optimal Left-to-Right Pattern-Matching Automata
We propose a practical technique to compile pattern-matching for prioritised overlapping patterns in equational languages into a minimal, deterministic, left-to-right, matching automaton. First, we present a method for constructing a tree matching automaton for such patterns. This allows pattern-matching to be performed without any backtracking. Space requirements are reduced by using a directed...
Journal title:
Volume, issue:
Pages: -
Publication date: 1999